Jaime Carbonell (Chair), Tom Mitchell

Authors

  • Tom Mitchell
  • Larry Wasserman
  • Robert Tibshirani
  • Xi Chen
  • Jingrong Wang
Abstract

The development of modern information technology has enabled the collection of data of unprecedented size and complexity. Examples include web text data, microarray and proteomics data, and data from scientific domains (e.g., meteorology). When learning from such high-dimensional and complex data, traditional machine learning techniques often suffer from the curse of dimensionality and unaffordable computational cost. However, learning from large-scale high-dimensional data provides big payoffs in text mining, gene analysis, and numerous other consequential tasks.

Recently developed sparse learning techniques provide a suite of tools for understanding and exploring high-dimensional data from many areas of science and engineering. By exploiting sparsity, we can learn a parsimonious and compact model that is more interpretable and computationally tractable at application time. When the underlying model is known to be sparse, sparse learning methods can provide a more consistent model and much improved prediction performance. However, existing methods are still insufficient for modeling complex or dynamic structures in the data, such as those evidenced in pathways of genomic data, gene regulatory networks, and synonyms in text data.

This thesis develops structured sparse learning methods along with scalable optimization algorithms to explore and predict high-dimensional data with complex structures. In particular, we address three aspects of structured sparse learning:

1. Efficient and scalable optimization methods with fast convergence guarantees for a wide spectrum of high-dimensional learning tasks, including single or multi-task structured regression, canonical correlation analysis, and online sparse learning.
2. Learning dynamic structures of different types of undirected graphical models, e.g., conditional Gaussian or conditional forest graphical models.
3. Demonstrating the usefulness of the proposed methods in various applications, e.g., computational genomics and spatial-temporal climatological data.

In addition, we design specialized sparse learning methods for text mining applications, including ranking and latent semantic analysis. In the last part of the thesis, we present future directions for high-dimensional structured sparse learning from both computational and statistical perspectives.
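To make the idea of sparse learning concrete, the following is a minimal sketch of plain L1-regularized regression (the lasso) solved by proximal gradient descent. It is an illustration only and is not one of the structured methods or algorithms developed in the thesis, which this listing does not describe in detail. The function names `soft_threshold` and `lasso_ista`, the synthetic data, and the choice of `lam` are all assumptions made for the example.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/(2n)) * ||y - X w||^2 + lam * ||w||_1 by proximal gradient."""
    n, d = X.shape
    # Step size 1/L, where L bounds the curvature of the least-squares term.
    L = np.linalg.norm(X, 2) ** 2 / n
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n          # gradient of the smooth part
        w = soft_threshold(w - grad / L, lam / L)
    return w

# Synthetic problem (assumed for illustration): only 3 of 50 features matter.
rng = np.random.default_rng(0)
n, d = 100, 50
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -3.0, 1.5]
y = X @ w_true + 0.1 * rng.standard_normal(n)

w_hat = lasso_ista(X, y, lam=0.1)
support = np.flatnonzero(w_hat)               # indices of nonzero coefficients
print("selected features:", support)
```

The soft-thresholding step sets small coefficients exactly to zero, which is what yields a compact, interpretable model. Structured penalties of the kind the abstract refers to would replace the plain L1 term with one that encodes group, graph, or other structure.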

Similar resources

From Data to Knowledge to Action: Enabling Advanced Intelligence and Decision-Making for America’s Security

Large-scale machine learning can fundamentally transform the ability of intelligence analysts to efficiently extract important insights relevant to our nation’s security from the vast amounts of intelligence data being generated and collected worldwide. Intelligence organizations can tap into rapid data analytics innovations that Internet industries and university research organizations are mak...

AI Magazine Cumulative Index -- Volumes 1-4

Davis, Randall. Expert Systems: Where are we? And where do we go from here?

CMU Report on TDT-2: Segmentation, Detection and Tracking

This paper reports the results achieved by Carnegie Mellon University on the Topic Detection and Tracking Project’s second-year evaluation for the segmentation, detection, and tracking tasks. Additional post-evaluation improvements are also

CMU Approach to TDT-2: Segmentation, Detection, and Tracking

This paper reports the results achieved by Carnegie Mellon University on the Topic Detection and Tracking Project’s second-year evaluation for the segmentation, detection, and tracking tasks. Additional post-evaluation improvements are also

Learning from Solution Paths: An Approach to the Credit Assignment Problem

In this article we discuss a method for learning useful conditions on the application of operators during heuristic search. Since learning is not attempted until a complete solution path has been found for a problem, credit for correct moves and blame for incorrect moves is easily assigned. We review four learning systems that have incorporated similar techniques to learn in the domains of algebr...

Publication date: 2013